Effect of congestion costs on shortest paths through complex networks 
We study analytically the effect of congestion costs within a physically relevant, yet exactly
solvable network model featuring central hubs. These costs lead to a competition between centralized 
and decentralized transport pathways. In stark contrast to conventional no-cost networks, there now 
exists an optimal number of connections to the central hub in order to minimize the shortest path. 
Our results shed light on an open problem in biology, informatics and sociology, concerning the 
extent to which decentralized versus centralized design benefits real-world complex networks. 
PACS numbers: 87.23.Ge, 05.70.Jk, 64.60.Fr, 89.75.Hc



The interplay between structure and function in complex
networks has become a major research topic in physics,
biology, informatics, and sociology [...]. For example, the
very same links, nodes and hubs that help create short-cuts
in space for transport may become congested due to increased
traffic, yielding an increase in transit time [...].
Unfortunately, there are very few analytic results available
concerning network congestion and optimal pathways in
real-world networks.

In this paper, we provide exact analytic results for the 
effects of congestion costs in networks with a combined 
ring-and-star topology. Figure 1(a) shows an example of 
our model network with N = 1 central hub. In addition 
to the fact that it is analytically tractable and possesses
a topology which is distinct from Refs. [...], our
model network is of direct relevance to a wide range of 
biological, computational and socio-economic systems in 
which there are one or more potentially congested central nodes.
Figure 1(b) shows the nutrient transport in a laboratory-grown
fungus [...]. The major transport pathways pass
through a central hub (i.e. centralized transport) with
some minor pathways around it (i.e. decentralized transport).
It is an important yet open question in biology as
to how organisms such as fungi make a trade-off between
centralized and decentralized transport, communication
and control. A related scenario with a similar topology
concerns the new congestion charge scheme in London,
which aims to dissuade drivers from passing through the
central zone. Airlines must balance the costs and benefits
of stopovers at major, yet potentially overcrowded,
airport hubs. Similar trade-offs between centralized and
decentralized routing, communication and control arise
in data networks, manufacturing supply-chains, and government.
Even for crime or terrorist networks, one can
ask how the Mafia's approach of passing all decisions
through a central 'Godfather' compares to the apparently
headless form of modern terrorist cells. More generally,
our model network could be used to describe clusters
or motifs within larger networks in which relatively
isolated hubs are connected to lower-connectivity nodes
(e.g. scale-free networks).

FIG. 1: (Color online) (a) Our model network showing transport
pathways through the central hub (connections of length 1/2
denoted by thick lines) and around the ring (connections of
length 1 denoted by thin lines). Graph shows the average shortest
path length between any two nodes in an n = 1000 node ring, with
a cost-per-connection to the hub of k = 1. There is an optimal
value for the number of connections ($\rho = pn \approx 44$) such that
the average shortest path length $\bar{\ell}$ is a minimum. We denote
this minimal shortest path length as $\bar{\ell} = \bar{\ell}|_{\min}$.
(b) Photon scintillation image showing the nutrient distribution
within a laboratory-grown fungus Phanerochaete velutina. Nutrient
density increases going from blue to green to red.

Our model represents a generalization of Ref. [...] to the
case of non-zero congestion costs. Each of the n nodes
around the ring is connected to its nearest neighbors by
a link of unit length. These links are directed in the
'directed' model, and undirected in the 'undirected' model.
With a probability p any node can be attached to the
central hub by a link of length 1/2. The links to the hub
are always undirected. For both the directed and undirected
models, explicit expressions can be derived for the
probability $P(\ell, m)$ that the shortest path between any
two nodes on the ring is $\ell$, given that they are separated
around the ring by length m. Summing over all m for a
given $\ell$ and dividing by (n - 1) yields the probability $P(\ell)$
that the shortest path between two randomly selected
nodes is of length $\ell$. The average value for the shortest
path across the network is then $\bar{\ell} = \sum_{\ell=1}^{n-1} \ell P(\ell)$. For
the undirected model, the expressions are more cumbersome
because there are more paths with the same length.
However, defining $nP(\ell) = Q(z, \rho)$ where $\rho = pn$ and
$z = \ell/n$, there is a simple relationship between the
undirected and directed models in the limit $n \to \infty$:
$Q_{\rm undir}(z, \rho) = 2 Q_{\rm dir}(2z, \rho)$ [...]. The models
only differ in this limit by a factor of two: $z \to 2z$, with
z now running from 0 to 1/2. The results which follow
were obtained by generalizing this procedure.
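
The procedure described above can be checked by brute force on a very small ring. The following Python sketch is illustrative only: the function name `exact_distribution` and the exhaustive enumeration are assumptions made here, not the authors' method. It enumerates every hub-attachment pattern of the directed, zero-cost model, weights each pattern by its probability, and accumulates the distribution of shortest path lengths over all ordered node pairs. It assumes that a hub route consists of the ring steps to the first attached node, two half-length hub links, and the ring steps from the last attached node before the target.

```python
from itertools import product


def exact_distribution(n, p):
    """Return P(l) for l = 1..n-1 (list indexed by l) for the directed,
    zero-cost ring-and-hub model, by enumerating all 2**n attachment patterns."""
    probs = [0.0] * n
    for attached in product([False, True], repeat=n):
        k = sum(attached)
        weight = p ** k * (1 - p) ** (n - k)
        for s in range(n):
            for m in range(1, n):          # m = separation around the ring
                t = (s + m) % n
                best = m                   # route that stays on the ring
                if k > 0:
                    # steps forward from s to the first attached node
                    a = next(d for d in range(n) if attached[(s + d) % n])
                    # steps from the last attached node before t up to t
                    b = next(d for d in range(n) if attached[(t - d) % n])
                    best = min(best, a + b + 1)   # two hub links of length 1/2
                probs[best] += weight / (n * (n - 1))
    return probs


if __name__ == "__main__":
    for length, prob in enumerate(exact_distribution(6, 0.25)):
        if length:
            print(length, round(prob, 6))
```

For n = 3 the enumeration gives P(1) = (1 + p^2)/2, because the pair separated by two ring steps is shortened only when both end nodes are attached to the hub. The cost of the enumeration grows as 2^n, so it is only useful as a sanity check on small rings.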

FIG. 2: Minimal shortest path length $\bar{\ell}|_{\min}$ (i.e. minimum
value of $\bar{\ell}$) as obtained from Eq. (5). (a) Optimal number of
connections $\rho = pn$ as a function of the cost-per-connection k
to the hub. Results are shown for n = 1000 and n = 10000.
(b) Optimal number of connections $\rho$ as a function of the
network size. Results are shown for k = 2 and k = 4.

We add a cost c every time a path passes through the
central hub. This cost c is expressed as an additional
path-length; however, it could also be expressed as a time
delay or reduction in flow-rate for transport and supply-chain
problems. We consider three cases: (1) constant
cost c, where c is independent of how many connections
the hub already has, i.e. c is independent of how 'busy'
the hub is; (2) linear cost c, where c grows linearly with
the number of connections to the hub, and hence varies
as $\rho = np$; (3) nonlinear cost c, where c grows with the
number of pairs connected directly across the network,
and hence varies as $\rho^2$.
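
As a compact restatement of the three cases, the following Python sketch returns the hub cost for each one. The function name `hub_cost`, the string labels, and the use of a single prefactor `k` for all three cases are assumptions made for illustration; the text only fixes how the cost scales with the expected number of hub connections, np.

```python
def hub_cost(model, k, n, p):
    """Cost c added to any path that passes through the hub.

    model: "constant", "linear" or "nonlinear" (hypothetical labels)
    k:     cost prefactor
    n, p:  ring size and hub-attachment probability, so n * p is the
           expected number of connections to the hub
    """
    connections = n * p
    if model == "constant":
        return k                      # independent of how busy the hub is
    if model == "linear":
        return k * connections        # grows with the number of connections
    if model == "nonlinear":
        return k * connections ** 2   # grows with the number of connected pairs
    raise ValueError(f"unknown cost model: {model}")


if __name__ == "__main__":
    for name in ("constant", "linear", "nonlinear"):
        print(name, hub_cost(name, 1, 1000, 0.044))
```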

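For a general, non-zero cost c that is independent of $\ell$
and m, we can write (for a network with directed links):

$$P(\ell, \ell \le c) = \frac{1}{n-1} \tag{1}$$

$$P(\ell < m, \ell > c) = (\ell - c)\, p^2 (1-p)^{\ell-c-1} \tag{2}$$

$$P(\ell = m, \ell > c) = 1 - p^2 \sum_{i-c=1}^{\ell-c-1} (i-c)(1-p)^{(i-c)-1} \tag{3}$$

Performing the summation gives:

$$P(\ell = m, \ell > c) = [1 + (\ell-c-1)p](1-p)^{\ell-c-1} \tag{4}$$

The shortest path distribution is hence:

$$P(\ell) = \begin{cases} \dfrac{1}{n-1} & \forall\, \ell \le c \\[2ex] \dfrac{1}{n-1}\left[1 + (\ell-c-1)p + (n-1-\ell)(\ell-c)p^2\right](1-p)^{\ell-c-1} & \forall\, \ell > c \end{cases}$$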
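
The piecewise distribution above is easy to evaluate numerically. The following Python sketch is a minimal illustration, assuming an integer cost c and including the overall 1/(n - 1) normalization in both branches; the function name `shortest_path_distribution` is chosen here and does not come from the paper. It tabulates P(l) for the directed model and then checks that the probabilities sum to one.

```python
def shortest_path_distribution(n, p, c):
    """P(l) for l = 1..n-1 (list indexed by l, entry 0 unused) for the
    directed ring with one hub, attachment probability p and integer cost c."""
    probs = [0.0] * n
    for l in range(1, n):
        if l <= c:
            # too short for any hub route: only the ring path contributes
            probs[l] = 1.0 / (n - 1)
        else:
            j = l - c
            bracket = 1 + (j - 1) * p + (n - 1 - l) * j * p * p
            probs[l] = bracket * (1 - p) ** (j - 1) / (n - 1)
    return probs


if __name__ == "__main__":
    probs = shortest_path_distribution(1000, 0.044, 44)
    print("normalization:", sum(probs))
    print("mean path length:", sum(l * q for l, q in enumerate(probs)))
```

The parameters in the usage example correspond to n = 1000 with about 44 hub connections and a cost of one unit per connection, the case shown in Fig. 1(a).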

Using the same analysis for undirected links yields a
simple relationship between the directed and undirected
models. Introducing the variable $\gamma = c/n$, with z and $\rho$
as before, we may define $nP(\ell) = Q(z, \gamma, \rho)$ and hence
find in the limit $p \to 0$, $n \to \infty$ that
$Q_{\rm undir}(z, \gamma, \rho) = 2 Q_{\rm dir}(2z, 2\gamma, \rho)$. For a fixed cost, not
dependent on network size or the connectivity, this analysis is
straightforward. Paths of length $\ell \le c$ are prevented from using
the central hub, while for $\ell > c$ the distribution $P(\ell)$ is
similar to that of Ref. [...].

For linear costs, dependent on network size and connectivity,
and for N = 1 central hub, we can show that there
exists a minimum value of the average shortest path $\bar{\ell}$ as
a function of the connectivity to the central hub. Hence
there is an optimal number of connections to the central
hub, in order to create the minimum possible average
shortest path. We denote this minimal path length as
$\bar{\ell} = \bar{\ell}|_{\min}$. Such a minimum is in stark contrast to the
case of zero cost per connection, where the value of $\bar{\ell}$
would just decrease monotonically towards one with an
increasing number of connections to the hub. We now
calculate the average shortest path, $\bar{\ell} = \sum_{\ell=1}^{n-1} \ell P(\ell)$, which
yields:

$$\bar{\ell} = \frac{(1-p)^{n-c}\,[3 + (n-2-c)p]}{p^2 (n-1)} + \frac{p\,[2 - 2c + 2n - (c-1)(c-n)p] - 3}{p^2 (n-1)} + \frac{c(c-1)}{2(n-1)} \tag{5}$$
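
The closed form of Eq. (5) can be compared with a direct evaluation of the sum over the shortest path distribution. The Python sketch below is a hedged check rather than part of the original derivation: both function names are hypothetical, the cost c is assumed to be an integer, and the distribution is taken with the 1/(n - 1) normalization in both branches.

```python
def mean_direct(n, p, c):
    """Average shortest path from the explicit sum over l = 1..n-1."""
    total = 0.0
    for l in range(1, n):
        if l <= c:
            weight = 1.0
        else:
            j = l - c
            weight = (1 + (j - 1) * p + (n - 1 - l) * j * p * p) * (1 - p) ** (j - 1)
        total += l * weight
    return total / (n - 1)


def mean_closed_form(n, p, c):
    """Average shortest path from the three-term closed form, Eq. (5)."""
    first = (1 - p) ** (n - c) * (3 + (n - 2 - c) * p)
    second = p * (2 - 2 * c + 2 * n - (c - 1) * (c - n) * p) - 3
    return (first + second) / (p * p * (n - 1)) + c * (c - 1) / (2 * (n - 1))


if __name__ == "__main__":
    for n, p, c in [(3, 0.3, 1), (50, 0.1, 3), (1000, 0.044, 44)]:
        print(n, p, c, mean_direct(n, p, c), mean_closed_form(n, p, c))
```

The closed form divides by p^2, so for very small p it loses precision through cancellation; the direct sum is the safer choice in that regime.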

Figure 1(a) shows the functional form of $\bar{\ell}$ with a
cost of 1 unit path-length per connection to the hub (i.e.
$c = knp = k\rho$, with k = 1). The optimal number of connections
in order that $\bar{\ell}$ is a minimum is approximately 44
and depends on n. The corresponding minimal shortest
path $\bar{\ell}|_{\min}$ is approximately 85. An analytic expression
for $\bar{\ell}|_{\min}$ can be obtained by setting the derivative of
Eq. (5) equal to zero. If n is very large, one can introduce
a higher cost without compromising the minimal
shortest path $\bar{\ell}|_{\min}$, since in general the nodes are already
much further from one another. We can also investigate
how many connections we should make for a given cost
and network size, in order to achieve the minimum possible
shortest path $\bar{\ell}|_{\min}$. This is obtained by setting the
derivative of Eq. (5) equal to zero and solving for $\rho$.
Figure 2(a) shows analytic results for the optimal number
of connections which yield the minimal shortest path
$\bar{\ell}|_{\min}$, as a function of the cost per connection for a fixed
network size. Figure 2(b) shows analytic results for the
optimal number of connections which yield the minimal
shortest path $\bar{\ell}|_{\min}$, as a function of the network size for
a fixed cost per connection to the hub.
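
Instead of solving the derivative condition analytically, the optimum can be located by scanning Eq. (5) over integer numbers of hub connections. The following Python sketch is an assumption-laden illustration: the names `mean_path_length` and `optimal_connections` are hypothetical, the cost is taken as linear (c equal to k times the number of connections, with integer k), and the scan is restricted to values for which c stays below n.

```python
def mean_path_length(n, connections, k):
    """Eq. (5) with p = connections / n and linear cost c = k * connections."""
    p = connections / n
    c = k * connections
    first = (1 - p) ** (n - c) * (3 + (n - 2 - c) * p)
    second = p * (2 - 2 * c + 2 * n - (c - 1) * (c - n) * p) - 3
    return (first + second) / (p * p * (n - 1)) + c * (c - 1) / (2 * (n - 1))


def optimal_connections(n, k):
    """Integer number of hub connections minimizing Eq. (5), and the minimum."""
    candidates = range(1, (n - 1) // k + 1)   # keep the cost c below n
    best = min(candidates, key=lambda r: mean_path_length(n, r, k))
    return best, mean_path_length(n, best, k)


if __name__ == "__main__":
    print(optimal_connections(1000, 1))
```

For n = 1000 and k = 1 the scan lands in the mid-forties, in line with the optimum of approximately 44 connections quoted in the text. A grid scan is used here only for transparency; a root finder applied to the derivative would serve equally well.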

To gain insight into the underlying physics, we now
make some approximations to the exact analytic expressions.
For large n, or more importantly large n - c, the
term $(1-p)^{n-c} \to e^{-\rho}$ in Eq. (5). Provided that the cost
per connection to the hub is not too high, the region containing
the minimal shortest path $\bar{\ell}|_{\min}$ will be at a reasonably
high $\rho$ (recall Fig. 1(a)). Hence we can neglect
the exponential term and differentiate to find the minimum
value of $\bar{\ell}$ with $c = knp = k\rho$. It is reasonable to
assume that at fixed k, the optimal $\rho$ will increase with n like
$n^x$ where $0 < x < 1$. In particular, one obtains diffusive
behavior whereby $x \approx 1/2$. Specifically, $\rho \approx \sqrt{2n/k}$. For a
large network (i.e. large n), we have therefore obtained
a simple relationship between the number of connections
one should introduce in order to create the minimal average
shortest path between any two nodes in the network,
and the cost per connection to the hub. Comparison with
Figure 2 shows that this analytic scaling relation
is accurate even down to $n \sim 10$, but is particularly
good for n larger than $10^3$.
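
The scaling can be made explicit with a short worked step. This is a sketch under stated assumptions, not the authors' full calculation: the exponential term of Eq. (5) is dropped, the cost is linear ($c = k\rho$ with $\rho = pn$), and only terms that grow with n are kept, which requires $1 \ll \rho$ and $c \ll n$.

```latex
\begin{align}
\bar{\ell} &\approx \frac{2np - (c-1)(c-n)p^{2}}{p^{2}(n-1)}
            \approx \frac{2n}{\rho} + k\rho , \\
\frac{d\bar{\ell}}{d\rho} &= -\frac{2n}{\rho^{2}} + k = 0
            \quad\Longrightarrow\quad \rho \approx \sqrt{\frac{2n}{k}} , \\
\bar{\ell}\,|_{\min} &\approx 2\sqrt{2kn} = \sqrt{8kn} .
\end{align}
```

For n = 1000 and k = 1 this gives $\rho \approx 44.7$, close to the optimum of approximately 44 connections found from the exact expression. The neglected terms are of order one, which is why the estimate of the minimum itself is slightly high.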

Now we briefly turn to consider a specific yet physically
reasonable example of non-linear costs, in which
the costs are taken to depend on the number of pairs
which are connected via the hub. In particular, we use
$c = k(np)^2$. We obtain the analytic relationship $\rho \approx \sqrt[3]{n/k}$,
which is the non-linear equivalent of the above result.
Obviously, more accurate expressions can be obtained
since we know the complete form of the analytic solution;
however, these are too cumbersome algebraically to be
presented here.
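
The same leading-order argument can be repeated for the non-linear cost. As before this is a sketch under assumptions: the exponential term is neglected, $\rho = pn$, $c = k\rho^{2}$, and only terms growing with n are kept.

```latex
\begin{align}
\bar{\ell} &\approx \frac{2n}{\rho} + k\rho^{2} , \\
\frac{d\bar{\ell}}{d\rho} &= -\frac{2n}{\rho^{2}} + 2k\rho = 0
            \quad\Longrightarrow\quad \rho \approx \left(\frac{n}{k}\right)^{1/3} , \\
\bar{\ell}\,|_{\min} &\approx 2\,(kn^{2})^{1/3} + (kn^{2})^{1/3}
            = \left(27\,kn^{2}\right)^{1/3} .
\end{align}
```

The optimal number of connections therefore grows like $n^{1/3}$ and the minimal path like $n^{2/3}$, compared with $n^{1/2}$ for both quantities in the linear case.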

For linear costs, the lowest value of $\bar{\ell}$ one can achieve
is $\bar{\ell}|_{\min} \approx \sqrt{8kn}$. Setting $n = 10^3$ and k = 1 gives
$\bar{\ell}|_{\min} = 89.4$, which agrees well with the exact analytic
result shown in Fig. 1. For non-linear costs, the minimal
shortest path is $\bar{\ell}|_{\min} \approx \sqrt[3]{27kn^2}$. These last results show
that the minimal shortest path $\bar{\ell}|_{\min}$ across the network
grows like $n^{1/2}$ when we impose linear costs, while it grows
like $n^{2/3}$ when we put a cost on the number of direct connections
between nodes made via the hub (i.e. non-linear
costs). Corresponding results for the undirected model
can be easily obtained from the equations for the directed
model. For example, for linear costs $c = knp$ and undirected
links, we obtain $\bar{\ell}|_{\min} \approx \sqrt{4kn}$ and $\rho \approx \sqrt{n/k}$ for
the minimal shortest path and the optimal connectivity.
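
The leading-order estimates quoted in this section can be collected in one place. The following Python sketch simply evaluates those square-root and cube-root expressions; the function name `scaling_estimates` and the dictionary keys are hypothetical, and the formulas are the large-n approximations rather than the exact results.

```python
import math


def scaling_estimates(n, k):
    """Leading-order (optimal connections, minimal mean path) for each case."""
    return {
        # linear cost, directed ring links
        "directed_linear": (math.sqrt(2 * n / k), math.sqrt(8 * k * n)),
        # linear cost, undirected ring links
        "undirected_linear": (math.sqrt(n / k), math.sqrt(4 * k * n)),
        # cost proportional to the square of the number of connections
        "directed_nonlinear": ((n / k) ** (1 / 3), (27 * k * n ** 2) ** (1 / 3)),
    }


if __name__ == "__main__":
    for case, (connections, path) in scaling_estimates(1000, 1).items():
        print(f"{case}: connections ~ {connections:.1f}, minimal path ~ {path:.1f}")
```

For n = 1000 and k = 1 the directed linear entry reproduces the value of 89.4 given in the text.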

FIG. 3: Examples of the scaled probability distribution for a
network with N = 2 hubs, where the two hubs have associated
costs for travelling through them. In (a), $\rho_p = 20$ and $\rho_q = 10$
while the costs are $c_p = 0.15$ and $c_q = 0.05$. In (b), $\rho_p = 50$
and $\rho_q = 10$ while the costs are $c_p = 0.35$ and $c_q = 0.05$.



The present analysis can be extended to multiple hubs,
$N \ge 2$. For simplicity, we focus here on the specific example
of constant costs and N = 2 (i.e. hub P, with nodes
connected to it with probability p, and hub Q, with nodes
connected to it with probability q), where the costs associated
with the two hubs are $c_p$ and $c_q$ respectively, with $c_p > c_q$. The
cost for using both hubs is assumed to be infinite. It is
not hard to imagine real-world systems that employ multiple
central hubs but which would not favour pathways
through more than one at a time (e.g. an airline passenger
would avoid buying a ticket with two stop-overs). Of
course this assumption may not always be realistic (see,
for example, Ref. [10]).

We first consider what happens when $\ell > c_p > c_q$. In
this case, both hubs may be used and we may therefore
write:

$$\begin{aligned}
P(\ell < m) ={}& P_P(\ell < m, \ell > c_p)\Big[1 - \sum_{i-c_q=1}^{\ell-c_q-1} P_Q(i < m, i > c_q)\Big] \\
&+ P_Q(\ell < m, \ell > c_q)\Big[1 - \sum_{i-c_p=1}^{\ell-c_p-1} P_P(i < m, i > c_p)\Big] \\
&- P_P(\ell < m, \ell > c_p)\, P_Q(\ell < m, \ell > c_q)
\end{aligned} \tag{6}$$

where $P_P(\ell < m, \ell > c_p)$ and $P_Q(\ell < m, \ell > c_q)$ are understood
to be $P(\ell < m, \ell > c)$ from the single-hub-with-costs
case for probabilities p and q respectively. Substituting
Eq. (2) into the first term of Eq. (6) and performing
the summation yields:

$$P_P(\ell < m, \ell > c_p)\Big[1 - \sum_{i-c_q=1}^{\ell-c_q-1} P_Q(i < m, i > c_q)\Big] = (g_{0pq} + g_{1pq}\,\ell + g_{2pq}\,\ell^2)(a_p a_q)^{\ell-1} \tag{7}$$

where

$$a_p = 1-p, \qquad a_q = 1-q,$$

$$g_{0pq} = (1-p)^{-c_p}(1-q)^{-c_q}\, p^2 c_p\,[(c_q+1)q - 1],$$

$$g_{1pq} = (1-p)^{-c_p}(1-q)^{-c_q}\, p^2\,[1 - (c_p + c_q + 1)q],$$

$$g_{2pq} = (1-p)^{-c_p}(1-q)^{-c_q}\, p^2 q .$$

An equivalent substitution and summation performed
on the second term in Eq. (6) yields the same answer
but with labels p and q interchanged. The third term,
after substitution and summation, yields:

$$P_P(\ell < m, \ell > c_p)\, P_Q(\ell < m, \ell > c_q) = (h_0 + h_1 \ell + h_2 \ell^2)(a_p a_q)^{\ell-1} \tag{8}$$

where

$$h_0 = (1-p)^{-c_p}(1-q)^{-c_q}\, p^2 q^2 c_p c_q,$$

$$h_1 = -(1-p)^{-c_p}(1-q)^{-c_q}\, p^2 q^2 (c_p + c_q),$$

$$h_2 = (1-p)^{-c_p}(1-q)^{-c_q}\, p^2 q^2 .$$

Substitution of these individual terms into Eq. (6)
yields:

$$P(\ell < m) = (g'_0 + g'_1 \ell + g'_2 \ell^2)(a_p a_q)^{\ell-1} \tag{9}$$

where $g'_i = g_{ipq} + g_{iqp} - h_i$. To calculate the full probability
distribution for the case $\ell > c_p > c_q$ we now only
require $P(\ell = m)$:



$$P(\ell = m) = 1 - \sum_{i=c_q+1}^{c_p} P_Q(i < m) - \sum_{i=c_p+1}^{\ell-1} P(i < m) \tag{10}$$

where $P_Q(i < m)$ is the single-hub-plus-costs distribution
for a hub with probability q and $P(i < m)$ is given
by Eq. (9). We define the following functions:

$$f_x(a, n) = \sum_{i=1}^{n-1} i^x a^{i-1}, \qquad f_x(a, n_1, n_2) = f_x(a, n_1) - f_x(a, n_2) .$$

We then substitute $P_Q(i < m)$ and $P(i < m)$ into Eq.
(10), yielding:

$$\begin{aligned}
P(\ell = m, \ell > c_p) = 1 &- \frac{q^2}{(1-q)^{c_q}}\big[f_1(a_q, c_p+1, c_q+1) - c_q f_0(a_q, c_p+1, c_q+1)\big] \\
&- \big[g'_0 f_0(a_p a_q, \ell, c_p+1) + g'_1 f_1(a_p a_q, \ell, c_p+1) + g'_2 f_2(a_p a_q, \ell, c_p+1)\big]
\end{aligned} \tag{11}$$

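We now obtain the final distribution by performing the
sum over m:

$$P(\ell, \ell \le c_q) = \frac{1}{n-1} \tag{12}$$

$$P(\ell, c_q < \ell \le c_p) = \frac{1}{n-1}\big[1 + (\ell - c_q - 1)q + (n-1-\ell)(\ell-c_q)q^2\big](1-q)^{\ell-c_q-1} \tag{13}$$

$$\begin{aligned}
P(\ell, c_p < \ell) = \frac{1}{n-1}\Big\{ 1 &- \frac{q^2}{(1-q)^{c_q}}\big[f_1(a_q, c_p+1, c_q+1) - c_q f_0(a_q, c_p+1, c_q+1)\big] \\
&- \big[g'_0 f_0(a_p a_q, \ell, c_p+1) + g'_1 f_1(a_p a_q, \ell, c_p+1) + g'_2 f_2(a_p a_q, \ell, c_p+1)\big] \\
&+ (n-1-\ell)(g'_0 + g'_1 \ell + g'_2 \ell^2)(a_p a_q)^{\ell-1} \Big\}
\end{aligned} \tag{14}$$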



The resulting distribution, which has an interesting
multi-modal form, is plotted in Fig. 3 for the directed
case: Q now depends on five variables due to the additional
probability q and cost $c_q$, such that $\rho_p = pn$,
$\rho_q = qn$, $\gamma_p = c_p/n$, $\gamma_q = c_q/n$, with z as before. Interestingly,
if the value of $\rho_q$ increases above $\rho_p$ the distribution tends
to the single hub case extremely quickly, i.e. the P-hub
is then barely used. If the P-hub has a high degree and
a high cost, then the distribution behaves as though the
P-hub is not there until $z > \gamma_p$ (i.e. $\ell > c_p$), where it quickly falls
to zero. The undirected case is similar to the directed
case since once again the same scaling relationship exists
between them.
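
The polynomial form of Eq. (9) can be checked numerically against the explicit sums of Eq. (6). The Python sketch below is an illustrative check and not the authors' code: all function names are hypothetical, the costs are assumed to be integers with $c_p > c_q$, and the coefficient $h_1 = -(1-p)^{-c_p}(1-q)^{-c_q}p^2q^2(c_p + c_q)$ is obtained here by expanding the product of the two single-hub terms.

```python
def single_hub(l, p, c):
    """Eq. (2): probability that the route via one hub has length l (< m)."""
    return (l - c) * p * p * (1 - p) ** (l - c - 1) if l > c else 0.0


def p_less_direct(l, p, q, cp, cq):
    """Eq. (6) evaluated with explicit sums (valid for l > cp > cq)."""
    via_p, via_q = single_hub(l, p, cp), single_hub(l, q, cq)
    q_not_shorter = 1 - sum(single_hub(i, q, cq) for i in range(cq + 1, l))
    p_not_shorter = 1 - sum(single_hub(i, p, cp) for i in range(cp + 1, l))
    return via_p * q_not_shorter + via_q * p_not_shorter - via_p * via_q


def g_coefficients(p, q, cp, cq):
    """(g0, g1, g2) of Eq. (7); swap the argument pairs for the q-p ordering."""
    pref = (1 - p) ** (-cp) * (1 - q) ** (-cq) * p * p
    return (pref * cp * ((cq + 1) * q - 1),
            pref * (1 - (cp + cq + 1) * q),
            pref * q)


def p_less_closed(l, p, q, cp, cq):
    """Eq. (9): quadratic in l times (a_p * a_q) ** (l - 1)."""
    g_pq = g_coefficients(p, q, cp, cq)
    g_qp = g_coefficients(q, p, cq, cp)
    pref = (1 - p) ** (-cp) * (1 - q) ** (-cq) * p * p * q * q
    h = (pref * cp * cq, -pref * (cp + cq), pref)
    g = [x + y - z for x, y, z in zip(g_pq, g_qp, h)]
    return (g[0] + g[1] * l + g[2] * l * l) * ((1 - p) * (1 - q)) ** (l - 1)


if __name__ == "__main__":
    p, q, cp, cq = 0.1, 0.2, 3, 1
    for l in range(cp + 1, 10):
        print(l, p_less_direct(l, p, q, cp, cq), p_less_closed(l, p, q, cp, cq))
```

The two columns printed by the usage example should agree to rounding error for every $\ell > c_p$. The direct form relies only on the independence of the two sets of hub connections, so it is a useful reference when checking the longer expressions of Eqs. (11) to (14).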

In summary, we have presented analytic results for
a simple yet realistic model of a congested network.
Elsewhere we will discuss embedding our N-hub cluster
within larger and more complex networks, and will
present a quantitative comparison to the transport routings
observed within laboratory-grown fungi in an attempt
to understand 'costs' within biological networks.

N.F.J. is grateful to P.M. Hui and F.J. Rodriguez for
discussions, and thanks M. Tlalka, S.C. Watkinson, P.R. 
Darrah and M.D. Fricker for permission to use the image 
in Fig. 1(b). 